Spectral maxima representation for robust automatic speech recognition

Authors

J. Sujatha

K. R. Prasanna Kumar

K. R. Ramakrishnan

N. Balakrishnan

Abstract

In automatic speech recognition, the popular Mel Frequency Cepstral Coefficient (MFCC) features perform very well under clean, matched conditions but are observed to fail under mismatched conditions. Spectral maxima are often observed to preserve their locations and energies in noisy environments, yet MFCC features do not represent them explicitly. This paper presents a framework for representing the maxima information for robust recognition in the presence of additive White Gaussian Noise (WGN). For the task of phoneme-based Isolated Word Recognition (IWR) under different Signal-to-Noise Ratio (SNR) conditions, the results show improved recognition performance. The cepstral features are computed from a spectrogram reconstructed by fitting Gaussians around the spectral maxima. Given the inherent robustness and easy trackability of the maxima, this opens up interesting avenues toward robust feature representations as well as preprocessing techniques.
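The reconstruction step named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the Gaussians are fitted, so everything below is an assumption made for illustration: peaks are taken as strict local maxima of a single magnitude-spectrum frame, each Gaussian has a fixed width (the hypothetical `width` parameter) and a height equal to the peak value, and the cepstrum is a plain DCT-II of the log spectrum with the mel filterbank left out for brevity. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def find_spectral_maxima(frame):
    """Return the bin indices of strict local maxima in a 1-D spectrum."""
    interior = frame[1:-1]
    is_peak = (interior > frame[:-2]) & (interior > frame[2:])
    return np.flatnonzero(is_peak) + 1


def reconstruct_frame(frame, width=2.0):
    """Rebuild a spectrum as a sum of Gaussians centred on its maxima.

    Assumption: fixed width (in bins) and height equal to the peak value.
    """
    bins = np.arange(frame.size)
    recon = np.zeros(frame.size, dtype=float)
    for p in find_spectral_maxima(frame):
        recon += frame[p] * np.exp(-0.5 * ((bins - p) / width) ** 2)
    return recon


def cepstrum(frame, n_coeffs=13, floor=1e-10):
    """DCT-II of the floored log spectrum (no mel filterbank here)."""
    log_spec = np.log(np.maximum(frame, floor))
    n = log_spec.size
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n)
    return basis @ log_spec


# Synthetic frame with two spectral bumps, standing in for real speech.
bins = np.arange(64)
spectrum = (1.0 * np.exp(-0.5 * ((bins - 10) / 3.0) ** 2)
            + 0.5 * np.exp(-0.5 * ((bins - 40) / 3.0) ** 2))

peaks = find_spectral_maxima(spectrum)
recon = reconstruct_frame(spectrum, width=3.0)
features = cepstrum(recon)
```

A real front end would apply this frame by frame to a spectrogram and would probably estimate each Gaussian's width from the data rather than fixing it; the fixed width is used here only to keep the sketch short.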

Similar resources

Improving the performance of MFCC for Persian robust speech recognition

Mel Frequency Cepstral Coefficients are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...

Robust Noise Estimation Applied to Different Speech Estimators

In this paper we present a robust noise estimation method for speech enhancement algorithms. The robust noise estimation, based on a modified minima controlled recursive averaging noise estimator, was applied to different speech estimators. The investigated speech estimators were spectral subtraction (SS), log spectral amplitude speech estimator (LSA) and optimally modified log spectral amplitude estim...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Gradient Based Spectral Peak Location for Noise Robust Speech Recognition

In this paper a gradient-based algorithm for finding spectral peak locations is presented. The algorithm uses frequency gradients and accelerations in the spectrogram to locate the peaks. The results are then interpolated to yield a smooth peak envelope. The method is evaluated in the Aurora framework. A first pass locates all ...

Recognizing the message and the messenger: biomimetic spectral analysis for robust speech and speaker recognition

Humans are quite adept at communicating in the presence of noise. However, most speech processing systems, such as automatic speech and speaker recognition systems, suffer a significant drop in performance when speech signals are corrupted by unseen background distortions. The proposed work explores the use of a biologically motivated multi-resolution spectral analysis for speech representation....

Publication date: 2003
